PCSM-167: Restore eventsRead during recovery #150
Merged
inelpandzic merged 3 commits into main on Nov 18, 2025
boris-ilijic
approved these changes
Nov 17, 2025
During recovery, `eventsRead` was not restored from the checkpoint, which led to an inconsistent status after recovery in which we had applied more events than we had read. That is because when we recover, `eventsApplied` is correct but `eventsRead` is set to zero.

On top of this, the `eventsRead` value was also stored incorrectly in the checkpoint, because the stored value was `r.eventsRead.Load()`. Checkpointing is done periodically, and in the PCSM status `eventsRead` will generally be larger than `eventsApplied` because some of the events are yet to be applied, so PCSM can take a checkpoint with that difference.

But when we perform the recovery, we need to restore `eventsRead` to the value of `eventsApplied`, since the difference, that is, all the events we read that had not been applied when we crashed, is gone and will be re-read when we recover from the last checkpoint.
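The intended behavior can be illustrated with a minimal Go sketch. It is not the code from this PR: the `replState` and `checkpoint` types, their fields, and the `Checkpoint` and `Recover` methods are hypothetical names chosen for illustration, and the only detail taken from the description is that the counters are read with `.Load()`, which suggests atomic counters.

```go
package main

import "sync/atomic"

// checkpoint is a hypothetical persisted snapshot. Only the applied count
// is stored, because events that were read but not yet applied are lost
// on a crash and will be read again after recovery.
type checkpoint struct {
	EventsApplied int64
}

// replState is a hypothetical stand-in for the replication counters.
type replState struct {
	eventsRead    atomic.Int64
	eventsApplied atomic.Int64
}

// Checkpoint captures the applied count. The in-memory eventsRead may be
// larger at this moment, but that difference is deliberately not persisted.
func (r *replState) Checkpoint() checkpoint {
	return checkpoint{EventsApplied: r.eventsApplied.Load()}
}

// Recover restores both counters from the checkpoint. eventsRead is set to
// the applied count so the status never reports more applied than read.
func (r *replState) Recover(cp checkpoint) {
	r.eventsApplied.Store(cp.EventsApplied)
	r.eventsRead.Store(cp.EventsApplied)
}
```

With this shape, a checkpoint taken while 10 events were read and 7 applied recovers to 7 read and 7 applied, and the 3 unapplied events are counted again once they are re-read.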